Image Preprocessing For Geometric Feature Extraction in OCR Systems

Author

  • Safoora O.K
Abstract

Optical character recognition (OCR) is one of the most successful applications of pattern recognition and image processing. Character geometry is one of the most useful features for identifying characters in images. The geometric feature extraction techniques proposed in the literature are complex and require extensive implementation effort. In this paper, we propose a simple preprocessing technique that makes the feature extraction process easier. The proposed system is intended for preprocessing colour images containing printed English letters. The image is fed to a preprocessing unit, where it is first converted to grayscale and then thresholded to obtain a binary image. The next step is to filter out the noise present in the image. The area containing text is then identified and segmented. Finally, each character is extracted from the segmented area and a morphological thinning operation is applied to it, which simplifies the subsequent feature extraction.

Keywords: Geometric feature extraction, Optical character recognition, Preprocessing, Structural analysis.
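The preprocessing chain described in the abstract can be sketched in plain Python as follows. This is only an illustration, not the authors' implementation: the abstract does not name the threshold, the noise filter, the segmentation method, or the thinning algorithm, so the fixed threshold of 128, the isolated-pixel filter, the empty-column splitting, and the Zhang-Suen thinning used here are all assumptions, as are the function names.

```python
def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to luminance (BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row] for row in rgb]

def binarize(gray, t=128):
    """Assumed fixed threshold: dark pixels (text) -> 1, background -> 0."""
    return [[1 if v < t else 0 for v in row] for row in gray]

def drop_isolated(img):
    """Assumed noise filter: clear foreground pixels with no 8-neighbour."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(h):
        for c in range(w):
            if img[r][c]:
                block = sum(img[rr][cc]
                            for rr in range(max(r - 1, 0), min(r + 2, h))
                            for cc in range(max(c - 1, 0), min(c + 2, w)))
                if block - 1 == 0:  # only the pixel itself is set
                    out[r][c] = 0
    return out

def crop_text(img):
    """Crop to the bounding box of all foreground pixels (the text area)."""
    rows = [r for r, row in enumerate(img) if any(row)]
    if not rows:
        return []
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]

def split_chars(img):
    """Assumed segmentation: cut the text area at empty columns."""
    chars, start = [], None
    w = len(img[0]) if img else 0
    for c in range(w + 1):
        filled = c < w and any(row[c] for row in img)
        if filled and start is None:
            start = c
        elif not filled and start is not None:
            chars.append([row[start:c] for row in img])
            start = None
    return chars

def thin(binary):
    """Zhang-Suen thinning (one possible morphological thinning)."""
    img = [row[:] for row in binary]
    h, w = len(img), len(img[0])

    def px(r, c):  # pixels outside the image count as background
        return img[r][c] if 0 <= r < h and 0 <= c < w else 0

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(h):
                for c in range(w):
                    if not img[r][c]:
                        continue
                    # Neighbours P2..P9, clockwise starting from north.
                    n = [px(r - 1, c), px(r - 1, c + 1), px(r, c + 1), px(r + 1, c + 1),
                         px(r + 1, c), px(r + 1, c - 1), px(r, c - 1), px(r - 1, c - 1)]
                    b = sum(n)  # number of foreground neighbours
                    a = sum(1 for i in range(8) if n[i] == 0 and n[(i + 1) % 8] == 1)
                    p2, p4, p6, p8 = n[0], n[2], n[4], n[6]
                    if step == 0:
                        ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and ok:
                        to_clear.append((r, c))
            for r, c in to_clear:  # delete only after the full scan
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img
```

A typical call chain would be `split_chars(crop_text(drop_isolated(binarize(to_gray(image)))))`, followed by `thin` on each extracted character. A real system would more likely use an adaptive or Otsu threshold and a median filter, and would need line segmentation before splitting characters; those are left out to keep the sketch short.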

Similar resources

Preprocessing Techniques in Character Recognition

Advancements in pattern recognition have accelerated recently due to the many emerging applications which are not only challenging but also computationally more demanding, as is evident in Optical Character Recognition (OCR), Document Classification, Computer Vision, Data Mining, Shape Recognition, and Biometric Authentication, for instance. The area of OCR is becoming an integral part of do...

Document Image Dewarping Based on Text Line Detection and Surface Modeling (RESEARCH NOTE)

Document images produced by a scanner or digital camera usually suffer from geometric and photometric distortions, both of which degrade the performance of OCR systems. In this paper, we present a novel method to compensate for undesirable geometric distortions, aiming to improve OCR results. Our methodology is based on finding text lines by dynamic local connectivity map and then applying a l...

Optical Character Recognition Systems

Optical character recognition (OCR) is the process of classifying optical patterns contained in a digital image. Character recognition is achieved through segmentation, feature extraction and classification. This chapter presents the basic ideas of OCR needed for a better understanding of the book. The chapter starts with a brief background and history of OCR systems. Then the di...

Image preprocessing for optical character recognition using neural networks

The primary task of this master’s thesis is to create a theoretical and practical basis for the preprocessing of printed text for optical character recognition using feed-forward neural networks. A demonstration application was created and its parameters were set according to the results of the experiments carried out. Project definition and task determination 1. Write an introduction about the problematics of optica...

Optical Character Recognition for Hindi Language Using a Neural-network Approach

Hindi is the most widely spoken language in India, with more than 300 million speakers. Because characters in Hindi text are not separated as they are in English, the Optical Character Recognition (OCR) systems developed for the Hindi language have a very poor recognition rate. In this paper we propose an OCR for printed Hindi text in Devanagari script, using Artificial...

Publication date: 2015